Memory Hardware Support for Sparse Computations

Authors

  • Arnold J. Niessen
  • Harry A. G. Wijshoff
Abstract

Address computations and indirect, hence double, memory accesses in sparse matrix application software generally render sparse computations inefficient. In this paper we propose memory architectures that support the storage of sparse vectors and matrices. In a first design, called vector storage, a matrix is handled as an array of sparse vectors, stored as singly-linked lists. Deletion and insertion of a vector is done row- or column-wise only. In a second design, called matrix storage, a higher level of sophistication is achieved. A sparse matrix is stored as a bi-directionally threaded doubly-linked list of elements. This approach enables both row- and column-wise operations. Reading a row (column) can be done at the speed of one element (real value and indices) per memory cycle, while extracting or updating takes 2 memory cycles. Inserting an element can be done once every 2.5 memory cycles. A pipelined variant with 3-fold interleaved memory and write buffers yields higher efficiency, close to one sparse matrix element per memory cycle for all basic vector operations. In-memory operations also decrease the burden on the processor, cache, and bus.
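The matrix-storage design amounts to an orthogonally threaded list: each nonzero carries its value, both indices, and doubly-linked pointers along both its row and its column. Below is a minimal software sketch of such a structure in C, offered only to make the data layout concrete; the paper realizes the threading in dedicated memory hardware, and all names here (elem_t, sp_matrix, insert, extract) are illustrative assumptions, not the paper's.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch of the "matrix storage" idea: every nonzero is threaded into
 * both its row list and its column list, each thread doubly linked so
 * a located element can be unlinked without rescanning either list.
 * Names are illustrative; the paper implements this in hardware. */
typedef struct elem {
    double       value;
    int          row, col;
    struct elem *row_next, *row_prev;   /* thread along the row    */
    struct elem *col_next, *col_prev;   /* thread along the column */
} elem_t;

typedef struct {
    int      nrows, ncols;
    elem_t **row_head;                  /* first nonzero per row    */
    elem_t **col_head;                  /* first nonzero per column */
} sp_matrix;

/* Insert a nonzero, keeping row threads sorted by column and column
 * threads sorted by row. */
static void insert(sp_matrix *m, int i, int j, double v)
{
    elem_t *e = calloc(1, sizeof *e);
    e->value = v; e->row = i; e->col = j;

    elem_t **pp = &m->row_head[i], *prev = NULL;
    while (*pp && (*pp)->col < j) { prev = *pp; pp = &(*pp)->row_next; }
    e->row_next = *pp; e->row_prev = prev;
    if (*pp) (*pp)->row_prev = e;
    *pp = e;

    pp = &m->col_head[j]; prev = NULL;
    while (*pp && (*pp)->row < i) { prev = *pp; pp = &(*pp)->col_next; }
    e->col_next = *pp; e->col_prev = prev;
    if (*pp) (*pp)->col_prev = e;
    *pp = e;
}

/* Read row i: one node (value plus both indices) per list step. */
static void print_row(const sp_matrix *m, int i)
{
    for (const elem_t *e = m->row_head[i]; e; e = e->row_next)
        printf("a(%d,%d) = %g\n", e->row, e->col, e->value);
}

/* Extract element e in O(1) by unlinking it from both threads. */
static void extract(sp_matrix *m, elem_t *e)
{
    if (e->row_prev) e->row_prev->row_next = e->row_next;
    else             m->row_head[e->row]   = e->row_next;
    if (e->row_next) e->row_next->row_prev = e->row_prev;

    if (e->col_prev) e->col_prev->col_next = e->col_next;
    else             m->col_head[e->col]   = e->col_next;
    if (e->col_next) e->col_next->col_prev = e->col_prev;
    free(e);
}

int main(void)
{
    sp_matrix m = { 4, 4, calloc(4, sizeof(elem_t *)),
                          calloc(4, sizeof(elem_t *)) };
    insert(&m, 1, 3, 5.0);
    insert(&m, 1, 0, 2.0);
    insert(&m, 1, 2, -1.0);
    print_row(&m, 1);                     /* a(1,0), a(1,2), a(1,3) */
    extract(&m, m.row_head[1]->row_next); /* remove a(1,2) */
    print_row(&m, 1);
    return 0;
}
```

The O(1) unlink in extract is what the bi-directional threading buys: it corresponds to the constant-cost extract and update operations claimed in the abstract, without a rescan of the row or column.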


Similar Resources

High-Performance Linear Algebra Processor using FPGA

With recent advances in FPGA (Field Programmable Gate Array) technology, it is now feasible to use these devices to build special-purpose processors for floating-point-intensive applications that arise in scientific computing. FPGAs provide programmable hardware that can be used to design custom hardware without the high cost of traditional hardware design. In this talk we discuss two multi-proc...


Efficient Support of Parallel Sparse Computation for Array Intrinsic Functions of Fortran 90

Fortran 90 provides a rich set of array intrinsic functions. Each of these array intrinsic functions operates on the elements of multi-dimensional array objects concurrently. They provide a rich source of parallelism and play an increasingly important role in the automatic support of data-parallel programming. However, there is no such support when these intrinsic functions are applied to sparse data...


Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures

Graphics processors are increasingly used in scientific applications due to their high computational power, which comes from hardware offering multiple levels of parallelism and a layered memory hierarchy. Sparse matrix computations frequently arise in scientific applications, for example, when solving PDEs on unstructured grids. However, traditional sparse matrix algorithms are difficult to efficiently parallel...
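The kernel in question is the sparse matrix-vector product. As a point of reference only, a plain serial CSR version in C is sketched below; the GPU thread mapping and the autotuning that the paper actually contributes are not shown, and all names here are illustrative.

```c
#include <stdio.h>

/* Reference CSR sparse matrix-vector product y = A*x. The paper tunes
 * GPU-specific variants of this computation; this serial form only
 * fixes the semantics being tuned. */
static void spmv_csr(int nrows, const int *rowptr, const int *colind,
                     const double *val, const double *x, double *y)
{
    for (int i = 0; i < nrows; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[colind[k]];
        y[i] = sum;
    }
}

int main(void)
{
    /* 3x3 example: [[2,0,1],[0,3,0],[4,0,5]] */
    int    rowptr[] = {0, 2, 3, 5};
    int    colind[] = {0, 2, 1, 0, 2};
    double val[]    = {2, 1, 3, 4, 5};
    double x[]      = {1, 1, 1}, y[3];

    spmv_csr(3, rowptr, colind, val, x, y);
    printf("y = [%g, %g, %g]\n", y[0], y[1], y[2]); /* [3, 3, 9] */
    return 0;
}
```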


A Modified Steffensen's Method with Memory for Nonlinear Equations

In this note, we propose a modification of Steffensen's method with some free parameters. These parameters are then used for further acceleration via the concept of methods with memory. In this way, we derive a fast Steffensen-type method with memory for solving nonlinear equations. Numerical results are also given to support the underlying theory of the article.
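For context, the sketch below implements the classical parametric Steffensen iteration together with a well-known Traub-style memory update of the free parameter; it illustrates the with-memory acceleration idea but is not the particular modification derived in the paper.

```c
#include <stdio.h>
#include <math.h>

/* Test function: x^3 + 4x^2 - 10 = 0 has a root near x = 1.3652. */
static double f(double x) { return x*x*x + 4.0*x*x - 10.0; }

/* Parametric Steffensen iteration with a Traub-style memory update:
 * the free parameter gamma is re-estimated from the two most recent
 * iterates, raising the convergence order above Steffensen's 2 with
 * no extra function evaluations. This is the generic with-memory
 * scheme, not the specific modification of the cited paper. */
static double steffensen_with_memory(double x, double gamma,
                                     int maxit, double tol)
{
    double fx = f(x);
    for (int n = 0; n < maxit && fabs(fx) > tol; n++) {
        double w  = x + gamma * fx;             /* auxiliary point     */
        double d  = (f(w) - fx) / (gamma * fx); /* divided diff f[x,w] */
        double x1 = x - fx / d;                 /* Steffensen step     */
        double f1 = f(x1);
        if (f1 != fx)                           /* memory: update gamma */
            gamma = -(x1 - x) / (f1 - fx);
        x = x1; fx = f1;
    }
    return x;
}

int main(void)
{
    printf("root = %.12f\n", steffensen_with_memory(1.0, 0.01, 50, 1e-12));
    return 0;
}
```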


A GPU-Adapted Structure for Unstructured Grids

A key advantage of working with structured grids (e.g., images) is the ability to directly tap into the powerful machinery of linear algebra. This is much less so for unstructured grids, where intermediate bookkeeping data structures stand in the way. On modern high-performance computing hardware, the conventional wisdom behind these intermediate structures is further challenged by costly memory ...




Publication date: 1994